


Cloned DNA sequences related to the genomic RNA of lymphadenopathy-associated virus (LAV) and proteins encoded by said LAV genomic RNA

The invention relates to cloned DNA sequences indistinguishable from genomic RNA and DNA of lymphadenopathy-associated virus (LAV), a process for their preparation and their uses. It relates more particularly to stable probes including a DNA sequence which can be used for the detection of the LAV virus or related viruses or DNA proviruses in any medium, particularly biological samples containing any of them. The invention also relates to polypeptides, whether glycosylated or not, encoded by said DNA sequences.

Lymphadenopathy-associated virus (LAV) is a human retrovirus first isolated from the lymph node of a homosexual patient with lymphadenopathy syndrome, frequently a prodrome or a benign form of acquired immune deficiency syndrome (AIDS). Subsequently other LAV isolates have been recovered from patients with AIDS or pre-AIDS. All available data are consistent with the virus being the causative agent of AIDS.

A method for cloning such DNA sequences has already been disclosed in British Patent Application Nr. 84 23659 filed on September 19, 1984. Reference is hereafter made to that application as concerns subject matter in common with the further improvements to the invention disclosed herein.

The present invention aims at providing additional new means which should not only also be useful for the detection of LAV or related viruses (hereafter more generally referred to as "LAV viruses"), but also have more versatility, particularly in detecting specific parts of the genomic DNA of said viruses whose expression products are not always directly detectable by immunological methods.

The present invention further aims at providing polypeptides containing sequences in common with polypeptides encoded by the LAV genomic RNA. It relates even more particularly to polypeptides comprising antigenic determinants included in the proteins encoded and expressed by the LAV genome occurring in nature. An additional object of the invention is to further provide means for the detection of proteins related to LAV virus, particularly for the diagnosis of AIDS or pre-AIDS or, to the contrary, for the detection of antibodies against the LAV virus or proteins related therewith, particularly in patients afflicted with AIDS or pre-AIDS or more generally in asymptomatic carriers and in blood-related products. Finally the invention also aims at providing immunogenic polypeptides, and more particularly protective polypeptides for use in the preparation of vaccine compositions against AIDS or related syndromes.

The present invention relates to additional DNA fragments, hybridizable with the genomic RNA of LAV as they will be disclosed hereafter, as well as with additional cDNA variants corresponding to the whole genomes of LAV viruses. It further relates to DNA recombinants containing said DNAs or cDNA fragments.

The invention relates more particularly to a cDNA variant corresponding to the whole of LAV retroviral genomes, which is characterized by a series of restriction sites in the order hereafter (from the 5' end to the 3' end).

The coordinates of the successive sites of the whole LAV genome (restriction map) are indicated hereafter too, with respect to the Hind III site (selected as of coordinate 1) which is located in the R region. The coordinates are estimated with an accuracy of ± 200 bp.

Another DNA variant according to this invention optionally contains an additional Hind III site approximately at the 5 550 coordinate.

Reference is further made to fig. 1 which shows a more detailed restriction map of said whole DNA (λJ19).

An even more detailed nucleotidic sequence of a preferred DNA according to the invention is shown in figs. 4-12 hereafter.

The invention further relates to other preferred DNA fragments which will be referred to hereafter.

Additional features of the invention will appear in the course of the non-limitative disclosure of additional features of preferred DNAs of the invention, as well as of preferred polypeptides according to the invention. Reference will further be had to the drawings in which:

- fig. 1 is the restriction map of a complete LAV genome (clone λJ19);

- figs. 2 and 3 show diagrammatically parts of the three possible reading phases of LAV genomic RNA, including the open reading frames (ORF) apparent in each of said reading phases;

- figs. 4-12 show the successive nucleotidic sequences of a complete LAV genome. The possible peptidic sequences in relation to the three possible reading phases related to the nucleotidic sequences shown are also indicated;

- figs. 13-18 reiterate the sequence of part of the LAV genome containing the genes coding for the envelope proteins, with particular boxed peptidic sequences which correspond to groups which normally carry glycosyl groups.

The sequencing and determination of sites of particular interest were carried out on a phage recombinant corresponding to λJ19 disclosed in the abovesaid British Patent application Nr. 84 23659. A method for preparing it is disclosed in that application.

The whole recombinant phage DNA of clone λJ19 (disclosed in the earlier application) was sonicated according to the protocol of DEININGER (1983), Analytical Biochem. 129, 216. The DNA was repaired by a Klenow reaction for 12 hours at 16°C. The DNA was electrophoresed through 0.8 % agarose gel and DNA in the size range of 360-600 bp was cut out, electroeluted and precipitated. Resuspended DNA (in 10 mM Tris, pH 8; 0.1 mM EDTA) was ligated into M13mp8 RF DNA (cut by the restriction enzyme SmaI and subsequently treated with alkaline phosphatase), using T4 DNA- and RNA-ligases (Maniatis T et al (1982) - Molecular cloning - Cold Spring Harbor Laboratory). An E. coli strain designated as TG1 was used for further study. This strain has the following genotype:

Δlac pro, supE, thi, F' traD36, proAB, lacIq, ZΔM15, r-

This E. coli TG1 strain has the peculiarity of enabling recombinants to be recognized easily. The blue colour of the cells transfected with plasmids which did not recombine with a fragment of LAV DNA is not modified. To the contrary cells transfected by a recombinant plasmid containing a LAV DNA fragment yield white colonies. The technique which was used is disclosed in Gene (1983), 26, 101.

This strain was transformed with the ligation mix using the Hanahan method (Hanahan D (1983) J. Mol. Biol. 166, 557). Cells were plated out on tryptone-agarose plates with IPTG and X-gal in soft agarose. White plaques were either picked and screened or screened directly in situ using nitrocellulose filters. Their DNAs were hybridized with nick-translated DNA inserts of pUC18 Hind III subclones of λJ19. This permitted the isolation of the plasmids or subclones of λ which are identified in the table hereafter. In relation to this table it should also be noted that the designation of each plasmid is followed by the deposition number of a cell culture of E. coli TG1 containing the corresponding plasmid at the "Collection Nationale des Cultures de Micro-organismes" (C.N.C.M.) of the Pasteur Institute in Paris, France. A non-transformed TG1 cell line was also deposited at the C.N.C.M. under Nr. I-364. All these deposits took place on November 19, 1984. The sizes of the corresponding inserts derived from the LAV genome have also been indicated.
Essential features of the recombinant plasmids

pJ19-1 plasmid (I-365) 0.5 kb
    Hind III - Sac I - Hind III

pJ19-17 plasmid (I-367) 0.6 kb
    Hind III - Pst I - Hind III

pJ19-6 plasmid (I-366) 1.3 kb
    Hind III (5') - Bam HI - Xho I - Kpn I - Bgl II - Sac I (3') - Hind III

pJ19-13 plasmid (I-368) 6.7 kb
    Hind III (5') - Bgl II - Kpn I - Kpn I - Eco RI - Eco RI - Sal I - Kpn I - Bgl II - Bgl II - Hind III (3')



M : methionine
W : tryptophan
F : phenylalanine
Y : tyrosine
L : leucine
V : valine
I : isoleucine
G : glycine
T : threonine
S : serine
E : glutamic acid
D : aspartic acid
N : asparagine
Q : glutamine
P : proline.

The asterisk signs "*" correspond to stop codons (i.e. TAA, TAG and TGA).

Starting above the first line of the DNA nucleotidic sequence of fig. 4 the three reading phases are respectively marked "1", "2", "3", on the left handside of the drawing. The same relative presentation of the three theoretical reading phases is then used all over the successive lines of the LAV nucleotidic sequence.
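
By way of a non-limitative illustration, the presentation of a nucleotidic sequence under its three reading phases can be sketched as follows. The sketch assumes the standard genetic code and a DNA-sense sequence written 5' to 3'; the names CODON_TABLE and translate are hypothetical and are not taken from the drawings. Stop codons are rendered as "*", as in the figures.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (assumption:
# the figures use the standard code; "*" marks TAA, TAG and TGA).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, phase):
    """Translate dna in reading phase 1, 2 or 3 (phase 1 starts at nucleotide 1)."""
    start = phase - 1
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    # Codons containing an unexpected character are shown as "X".
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    example = "ATGAGAGTGAAGTAA"  # made-up sequence, for illustration only
    for phase in (1, 2, 3):
        print(phase, translate(example, phase))
```

For the made-up sequence above, phase 1 reads M-R-V-K followed by a stop, while phases 2 and 3 give unrelated readings, which is why each line of the figures carries three aminoacid rows.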

Figs. 2 and 3 provide a diagrammatized representation of the lengths of the successive open reading frames corresponding to the successive reading phases (also referred to by numbers "1", "2" and "3" appearing in the left handside part of fig. 2). The relative positions of these open reading frames (ORF) with respect to the nucleotidic structure of the LAV genome are referred to by the scale of numbers representative of the respective positions of the corresponding nucleotides in the DNA sequence. The vertical bars correspond to the positions of the corresponding stop codons.
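
The way such a diagram can be derived from a sequence may be sketched as follows: in a given reading phase, every stop codon is a "vertical bar", and the stop-free stretches between two bars are the open reading frames. This is only a minimal sketch under the assumption of a linear DNA-sense sequence; the function name open_reading_frames and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frames(dna, phase, min_codons=1):
    """Return (first, last) 1-based nucleotide positions of stop-free stretches.

    phase is 1, 2 or 3; stretches shorter than min_codons codons are skipped.
    """
    frames = []
    run_start = phase          # first nucleotide of the current stop-free run
    pos = phase                # first nucleotide of the codon being examined
    while pos + 2 <= len(dna):
        codon = dna[pos - 1:pos + 2]
        if codon in STOP_CODONS:
            if pos - run_start >= 3 * min_codons:
                frames.append((run_start, pos - 1))
            run_start = pos + 3    # next run begins after the stop codon
        pos += 3
    # A run that reaches the end of the sequence without a stop is also kept.
    if pos - run_start >= 3 * min_codons:
        frames.append((run_start, pos - 1))
    return frames


if __name__ == "__main__":
    print(open_reading_frames("ATGAAATAGCCCGGGTGA", phase=1))
```

Raising min_codons keeps only the longer stretches, which is roughly what a diagram at the scale of a whole genome would show.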

1) The "gag gene" (or ORF-gag)

The "gag gene" codes for core proteins. Particularly it appears that a genomic fragment (ORF-gag) thought to code for the core antigens including the p25, p18 and p13 proteins is located between nucleotidic position 236 (starting with 5' CTA GCG GAG 3') and nucleotidic position 1759 (ending by CTCG TCA CAA 3'). The structure of the peptides or proteins encoded by parts of said ORF is deemed to be that corresponding to phase 2.

The methionine aminoacid "M" coded by the ATG at position 260-262 is the probable initiation methionine of the gag protein precursor. The end of ORF-gag and accordingly of gag protein appears to be located at position 1759.

The beginning of p25 protein, thought to start by a P-I-V-Q-N-I-Q-G-Q-M-V-H .... aminoacid sequence, is thought to be coded for by the nucleotidic sequence CCTATA..., starting at position 659.

Hydrophilic peptides in the gag open reading frame are identified hereafter. They are defined starting from aminoacid 1 = Met (M) coded by the ATG starting from 260-2 in the LAV DNA sequence.

Those hydrophilic peptides are:
12-32 aminoacids inclusive
37-46
49-79
66-153
156-165
176-186
200-220
226-234
239-264
266-331
352-361
377-390
399-432
437-464
492-498

The invention also relates to any combination of those peptides.
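
By way of illustration only, the correspondence between this aminoacid numbering and the nucleotide coordinates can be sketched as follows, assuming that aminoacid 1 is coded by the three nucleotides starting at position 260 and that the reading frame is read without interruption. The function name peptide_to_nucleotides and its parameters are hypothetical.

```python
def peptide_to_nucleotides(first_aa, last_aa, orf_start):
    """Map an inclusive aminoacid range to inclusive nucleotide positions.

    orf_start is the position of the first nucleotide of the codon for
    aminoacid 1 (260 for the Met at the start of the gag reading frame).
    """
    nt_first = orf_start + 3 * (first_aa - 1)   # first base of the first codon
    nt_last = orf_start + 3 * last_aa - 1       # third base of the last codon
    return nt_first, nt_last


if __name__ == "__main__":
    # Aminoacid 1 itself occupies nucleotides 260-262.
    print(peptide_to_nucleotides(1, 1, 260))
    # The peptide spanning aminoacids 12-32 inclusive.
    print(peptide_to_nucleotides(12, 32, 260))
```

Under these assumptions the peptide 12-32 would be coded by nucleotides 293 to 355.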

2) The "pol gene" (or ORF-pol)

Figs. 4-12 also show that the DNA fragment extending from nucleotidic position 1535 (starting with 5' TTT TTT ....3') to nucleotidic position 5086 is thought to correspond to the pol gene. The polypeptidic structure of the corresponding polypeptides is deemed to be that corresponding to phase 1. It stops at position 4503 (ending by 5' G GAT GAG GAT 3').

These genes are thought to code for the virus polymerase or reverse transcriptase.

3) The envelope gene (or ORF-env)

The DNA sequence thought to code for envelope proteins is thought to extend from nucleotidic position 5670 (starting with 5' AAA GAG GAG A....3') up to nucleotidic position 8132 (ending by ....A ACT AAA GAA 3'). Polypeptidic structures of sequences of the envelope protein correspond to those read according to the "phase 3" reading phase.

The start of env transcription is thought to be at the level of the ATG codon at positions 5691-5693.

Additional features of the envelope protein coded by the env genes appear on figs. 13-18. These are to be considered as paired figs. 13 and 14; 15 and 16; 17 and 18 respectively.

It is to be mentioned that because of format difficulties:

Fig. 14 overlaps to some extent with fig. 13.
Fig. 16 overlaps to some extent with fig. 15.
Fig. 18 overlaps to some extent with fig. 17.
Thus for instance figs. 13 and 14 must be considered together. Particularly the sequence shown on the first line on the top of fig. 13 overlaps with the sequence shown on the first line on the top of fig. 14. In other words the reading of the successive sequences of the env gene as represented in figs. 13-18 involves first reading the first line at the top of fig. 13, then proceeding further with the first line of fig. 14. One then returns to the beginning of the second line of fig. 13, then again proceeds with the reading of the second line of fig. 14, etc. The same observations then apply to the reading of the paired figs. 15 and 16, and paired figs. 17 and 18, respectively.
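
The reading order described above can be sketched as follows, purely as an illustration. The sketch assumes that each line of the right-hand figure begins by repeating the end of the matching line of the left-hand figure, and it collapses the longest such repeat; the function names merge_with_overlap and read_paired_figures are hypothetical, and the short strings in the example are made up.

```python
def merge_with_overlap(left, right):
    """Join two strings, collapsing the longest suffix of left that is a prefix of right."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return left + right  # no overlap found: plain concatenation


def read_paired_figures(left_lines, right_lines):
    """Read line 1 of the left figure, then line 1 of the right figure, then line 2, etc."""
    return "".join(
        merge_with_overlap(left, right)
        for left, right in zip(left_lines, right_lines)
    )


if __name__ == "__main__":
    fig_left = ["ACGTAC", "GGATTC"]    # made-up lines standing in for one figure
    fig_right = ["TACGGA", "TTCAAA"]   # made-up lines standing in for its pair
    print(read_paired_figures(fig_left, fig_right))
```

A very short overlap may occur by chance, so in practice the printed nucleotide coordinates under each line would remain the safer guide for aligning the two halves.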

The locations of neutralizing epitopes are further apparent in figs. 13-18. Reference is more particularly made to the boxed groups of three letters included in the aminoacid sequences of the envelope proteins (reading phase 3) which can be designated generally by the formula N-X-S or N-X-T, wherein X is any other possible aminoacid. Thus the initial protein product of the env gene is a glycoprotein of molecular weight in excess of 91,000. These groups are deemed to generally carry glycosylated groups. These N-X-S and N-X-T groups with attached glycosylated groups form together hydrophilic regions of the protein and are deemed to be located at the periphery of and to be exposed outwardly with respect to the normal conformation of the proteins. Consequently they are considered as being epitopes which can efficiently be brought into play in vaccine compositions.
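
A minimal sketch of how the groups of formula N-X-S or N-X-T can be located in an aminoacid sequence written in the one-letter code is given below. It takes X to be any aminoacid, as in the formula above; the function name find_glycosylation_groups is hypothetical and the example sequence is made up.

```python
import re


def find_glycosylation_groups(protein):
    """Return (position, triplet) for every N-X-S or N-X-T group.

    Positions are 1-based and refer to the N of each group.
    """
    # A lookahead is used so that overlapping groups (e.g. N-N-T-S) are all found.
    pattern = re.compile(r"(?=(N[A-Z][ST]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein)]


if __name__ == "__main__":
    print(find_glycosylation_groups("MNATKNNTSQ"))
```

In the made-up sequence the groups N-A-T, N-N-T and N-T-S are reported at positions 2, 6 and 7; the last two overlap, which is why a plain non-overlapping search would miss one of them.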

The invention thus concerns with more particularity peptide sequences included in the env proteins and excisable therefrom (or having the same aminoacid structure), having sizes not exceeding 200 aminoacids.

Preferred peptides of this invention (referred to hereafter as a, b, c, d, e, f) are deemed to correspond to those encoded by the nucleotide sequences which extend respectively between the following positions:

a) from about 6095 to about 6200
b) from about 6260 to about 6310
c) from about 6390 to about 6440
d) from about 6485 to about 6620
e) from about 6860 to about 6930
f) from about 7535 to about 7630

Other hydrophilic peptides in the env open reading frame are identified hereafter. They are defined starting from aminoacid 1 = lysine (K) coded by the AAA at position 5670-2 in the LAV DNA sequence.

These hydrophilic peptides are:
6-23 aminoacids inclusive
63-76
82-90
97-123
127-183
197-201
239-294
300-327
334-381
397-424
469-500
510-523
551-577
594-603
621-630
657-679
719-756
760-803

The invention also relates to any combination of these peptides.

4) The other ORF

The invention further concerns DNA sequences which provide open reading frames defined as ORF-Q, ORF-R and as "1", "2", "3", "4", "5", the relative position of which appears more particularly in figs. 2 and 3.

These ORFs have the following locations:

ORF-Q   phase 1   start 4476   stop 5086
ORF-R   phase 2   start ...    stop ...
ORF-1             start 5029   stop 5316
ORF-2             start 5273   stop 5515
ORF-3             start 5363   stop 5516
ORF-4             start 5519   stop 5773
ORF-5             start 7966   stop 8279

The LTR (long terminal repeats) can be defined as lying between position 8560 and position 160 (and extending over position 9097/1). As a matter of fact the end of the genome is at 9097 and, because of the LTR structure of the retrovirus, links up with the beginning of the sequence:

      Hind III
CTCAATAAAGCTTGCCTTG
      9097  1
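
Because the numbering links the end of the genome (9097) with its beginning (1), the length of a region that runs across that junction cannot be obtained by a simple subtraction. The following is a minimal sketch of the arithmetic; the function name circular_span is hypothetical, and the start coordinate 9000 used in the example is an arbitrary value chosen for illustration, not a coordinate of the invention.

```python
def circular_span(start, end, genome_length):
    """Number of nucleotides from start to end inclusive, wrapping past the junction if needed."""
    if start <= end:
        return end - start + 1
    # The region runs from start to the last position, then from 1 to end.
    return (genome_length - start + 1) + end


if __name__ == "__main__":
    # A region that does not cross the junction (ORF-gag, 236 to 1759).
    print(circular_span(236, 1759, 9097))
    # A hypothetical region starting at 9000 and ending at 160, across 9097/1.
    print(circular_span(9000, 160, 9097))
```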

The invention concerns more particularly all the DNA fragments which have been more specifically referred to hereabove and which correspond to open reading frames. It will be understood that the man skilled in the art will be able to obtain them all, for instance by cleaving an entire DNA corresponding to the complete genome of a LAV species, such as by cleavage by a partial or complete digestion thereof with a suitable restriction enzyme and by the subsequent recovery of the relevant fragments. The different DNAs disclosed in the earlier mentioned British Application can be resorted to also as a source of suitable fragments. The techniques disclosed hereabove for the isolation of the fragments which were then included in the plasmids referred to hereabove and which were then used for the DNA sequencing can be used.
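
A complete digestion of a linear DNA by one restriction enzyme can be sketched as follows, by way of example only. The sketch uses Hind III, which recognizes AAGCTT and cuts after the first A; the function name digest, its parameters and the short example sequence are hypothetical. A partial digestion would correspond to cutting at only some of the positions found.

```python
def digest(dna, site, cut_offset):
    """Cut linear dna at every occurrence of site and return the fragments in order.

    cut_offset is the number of bases of the recognition site left on the
    5' side of the cut (1 for Hind III, A^AGCTT).
    """
    cuts = []
    i = dna.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = dna.find(site, i + 1)
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    fragments = digest("GGAAGCTTCCCCAAGCTTGG", "AAGCTT", 1)  # made-up sequence
    print(fragments, [len(f) for f in fragments])
```

The fragment sizes obtained in this way are what would be compared with the sizes of the inserts recovered after electrophoresis.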

Of course other methods can be used. Some of them have been exemplified in the earlier British Application. Reference is for instance made to the following methods.

a) DNA can be transfected into mammalian cells with appropriate selection markers by a variety of techniques: calcium phosphate precipitation, polyethylene glycol, protoplast-fusion, etc.

b) DNA fragments corresponding to genes can be cloned into expression vectors for E. coli, yeast or mammalian cells and the resultant proteins purified.

c) The proviral DNA can be "shot-gunned" (fragmented) into procaryotic expression vectors to generate fusion polypeptides. Recombinants producing antigenically competent fusion proteins can be identified by simply screening the recombinants with antibodies against LAV antigens.

The invention also relates more specifically to cloned probes which can be made starting from any DNA fragment according to this invention, thus to recombinant DNAs containing such fragments, particularly any plasmids amplifiable in procaryotic or eucaryotic cells and carrying said fragments.

Using the cloned DNA fragments as a molecular hybridization probe - either by marking with radionucleotides or with fluorescent reagents - LAV virion RNA may be detected directly in the blood, body fluids and blood products (e.g. of the antihemophilic factors such as Factor VIII concentrates) and vaccines, e.g. hepatitis B vaccine. It has already been shown that whole virus can be detected in culture supernatants of LAV producing cells. A suitable method for achieving that detection comprises immobilizing virus onto a support, e.g. nitrocellulose filters, etc., disrupting the virion and hybridizing with labelled (radiolabelled or "cold" fluorescent- or enzyme-labelled) probes. Such an approach has already been developed for Hepatitis B virus in peripheral blood (according to SCOTTO J. et al. Hepatology (1983), 3, 379-394).

Probes according to the invention can also be used for rapid screening of genomic DNA derived from the tissue of patients with LAV related symptoms, to see if the proviral DNA or RNA is present in host tissue and other tissues.

A method which can be used for such screening comprises the following steps: extraction of DNA from tissue, restriction enzyme cleavage of said DNA, electrophoresis of the fragments and Southern blotting of genomic DNA from tissues, subsequent hybridization with labelled cloned LAV proviral DNA. Hybridization in situ can also be used.
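
The hybridization step can be mimicked on sequences, as a rough illustration only: a fragment is scored as positive when the probe, or its reverse complement, aligns somewhere along it. In this sketch the number of tolerated mismatches is a crude stand-in for relaxed hybridization and washing conditions; it is an assumption of the example and not a model of actual hybridization kinetics. The names reverse_complement, hybridizes and max_mismatches are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def hybridizes(fragment, probe, max_mismatches=0):
    """True if probe or its reverse complement aligns within fragment.

    An alignment is accepted when at most max_mismatches bases differ.
    """
    for candidate in (probe, reverse_complement(probe)):
        for i in range(len(fragment) - len(candidate) + 1):
            window = fragment[i:i + len(candidate)]
            mismatches = sum(a != b for a, b in zip(window, candidate))
            if mismatches <= max_mismatches:
                return True
    return False


if __name__ == "__main__":
    probe = "ACGTGG"                                      # made-up probe
    print(hybridizes("TTTACGTGGAAA", probe))              # exact match
    print(hybridizes("AAACCACGTTT", probe))               # match on the other strand
    print(hybridizes("TTTACGTCGAAA", probe))              # one mismatch, strict
    print(hybridizes("TTTACGTCGAAA", probe, max_mismatches=1))
```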

Lymphatic fluids and tissues and other non-lymphatic tissues of humans, primates and other mammalian species can also be screened to see if other evolutionarily related retroviruses exist. The methods referred to hereabove can be used, although hybridization and washings would be done under non stringent conditions.

The DNA according to the invention can be used also for achieving the expression of LAV viral antigens for diagnostic purposes.

The invention also relates to the polypeptides themselves which can be expressed by the different DNAs of the invention, particularly by the ORFs or fragments thereof, in appropriate hosts, particularly procaryotic or eucaryotic hosts, after transformation thereof with a suitable vector previously modified by the corresponding DNAs.

These polypeptides can be used as diagnostic tools, particularly for the detection of antibodies in biological media, particularly in sera or tissues of persons afflicted with pre-AIDS or AIDS, or simply carrying antibodies in the absence of any apparent disorders. Conversely the different peptides according to this invention can be used themselves for the production of antibodies, preferably monoclonal antibodies specific of the different peptides respectively. For the production of hybridomas secreting said monoclonal antibodies conventional production and screening methods are used. These monoclonal antibodies, which themselves are part of the invention, then provide very useful tools for the identification and even determination of relative proportions of the different polypeptides or proteins in biological samples, particularly human samples containing LAV or related viruses.

Thus all of the above peptides can be used in diagnostics as sources of immunogens or antigens free of viral particles, produced using non-permissive systems, and thus of little or no biohazard risk.

The invention further relates to the hosts (procaryotic or eucaryotic cells) which are transformed by the above mentioned recombinants and which are capable of expressing said DNA fragments.

Finally it also relates to vaccine compositions whose active principle is to be constituted by any of the expressed antigens, i.e. whole antigens, fusion polypeptides or oligopeptides in association with a suitable pharmaceutical or physiologically acceptable carrier.

Preferably the active principles to be considered in that field consist of the peptides containing less than 250 aminoacid units, preferably less than 150, as deducible from the complete genomes of LAV, and even more preferably those peptides which contain one or more groups selected from N-X-S and N-X-T as defined above. Preferred peptides for use in the production of vaccinating principles are peptides (a) to (f) as defined above. By way of example having no limitative character, there may be mentioned that suitable dosages of the vaccine compositions are those which enable administration to the host, particularly human host, ranging from 10 to 500 micrograms per kg, for instance 80 to 100 micrograms per kg.

For the purpose of clarity figs. 19 to 26 are added. Reference may be made thereto in case of difficulties of reading blurred parts of figs. 4 to 12. Needless to say that figs. 19-28 are merely a reiteration of the whole DNA sequence of the LAV genome.

Finally the invention also concerns vectors for the transformation of eucaryotic cells of human origin, particularly lymphocytes, the polymerases of which are capable of recognizing the LTRs of LAV. Particularly said vectors are characterized by the presence of a LAV LTR therein, said LTR being then active as a promoter enabling the efficient transcription and translation, in a suitable host of the above defined, of a DNA insert coding for a determined protein placed under its control.

Needless to say that the invention extends to all variants of genomes and corresponding DNA fragments (ORFs) having substantially equivalent properties, all of said genomes belonging to retroviruses which can be considered as equivalents of LAV.

CLAIMS

1. A DNA fragment of LAV extending from nucleotide position 236 to nucleotide position 1759.

2. A DNA fragment of LAV extending from nucleotide position 1535 to nucleotide position 5086.

3. A DNA fragment of LAV extending from nucleotide position 5670 to nucleotide position 8132.

4. A vector containing a DNA fragment according to any of claims 1 to 3.

5. Peptide corresponding to any of those encoded by the nucleotide sequences which extend respectively between the following positions:

a) from about 6095 to about 6200
b) from about 6260 to about 6310
c) from about 6390 to about 6440
d) from about 6485 to about 6620
e) from about 6860 to about 6930
f) from about 7535 to about 7630

6. Peptide characterized by a sequence of aminoacids deducible from LAV DNA, the terminal aminoacids of which extend between the following positions with respect to the lysine (position 1) coded by the AAA at position 5670-5672 in the LAV DNA:

6-23 aminoacids inclusive
63-76
82-90
97-123
127-183
197-201
239-294
300-327
334-381
397-424
469-500
510-523
551-577
594-603
621-630
657-679
719-756
760-803

or any combination of these peptides.

7. Peptide corresponding to the aminoacid sequences deducible from LAV DNA and the terminal aminoacids of which are positioned at the positions hereafter, counted from the Met at position 1 coded by the ATG sequence at nucleotide positions 260-2:

12-32 aminoacids inclusive
37-46
49-79
66-153
156-165
176-186
200-220
226-234
239-264
266-331
352-361
377-390
399-432
437-464
492-498

and combinations of said peptides.

8. Diagnostic means containing any of the DNA fragments of any of claims 1 to 3.

9. Diagnostic means containing any of the peptides of any of claims 4 to 6.

10. Vaccine compositions containing any of the peptides according to any of claims 4 to 6 in association with a pharmaceutical vehicle.
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. < ; C w : - C H V ? -I L c" F - » K »| P - ; - ., . 

r f. l s * k * s * ; I i< * p u o i o i / s L 

™. .CiGAOr,ACACCAAr.AAATv;«£.\#v.CACTACATCCTACACTACACCCCTCCAACCATCCACCAACTCACCCTAft 
5<!^0 S300 5310 ^3^0 ->330 S3*0 b3t>0 

P$LF"MKSt.*MLLrfOEt4erATKTS 
OVCFTTKALCI SYGRKKKfcORRRPP 
KFVSOQKP • ASP.IAGRS GOSOEOut 
CC A ACT TTCT TTCACAACAAAACCCTTACCCATC tcct AT GGCAGCAAGA ACC CCACAC ACCCACC AACACC rcc* 
5*10 5*20 5*30 5**0 5*50 5*60 5*70 

SrCNATr'TNSNSSI SSSNNNSNSCV 
V H V . * 0 P I Q ! A I A A L V VA I I I A I V V * 

AO rACATr.TAATGCAACCT A TAC A AAT ACC AAT ACC ACC ATTAGT ACT ACC A AT A AT AAT ACC A AT AGTT GTGTG( 
5530 55*0 5550 5560 5570 5560 5590 

*i ' 'J * N ♦ ♦ T N ft K $,* * 0 W 0 ♦ E ♦ R R N I S 
lUkLiDftLteftAEOSCNESECeiSA 
♦ T C ♦ L I 0* ♦KEOKTVAHRVKEKY'JT 
AATAGACAG^TTAATTGAT A CACT AAT AGA AAG ACC AGA ACACAGTCGC A ATG AGAGTGAAGGACA AAT ATC ACC * 
5650 5660 5670 5680 5690 5700 5710 

Y**SVVLOKNCGSQSE«GYLCGRKQ 
I 0 0 U ♦ C T ft K I V G H $ L L rfGTC VEGSN 
L « I C > 5 \ A T EK C W V T V Y Y G V P V W K E A • 
TATTCATCATCTCTACTCCT AC ACAAAAATTCTCCCTC AC ACTCT ATT AT GCCCTACCTCTCTCCAACCA ACCA A( 
5770 57*0 5790 5600 5910 5K20 5330 

K i \ nFGPHUPVY.PGTPTHKK^Y*** 
■CT*CLGHTCLCTH*PQPT*SSIG ' 
V H N V m A T H AC. VP.TO-PNP 0 E V V L V ft \ 
ACCTACATAATCTTTC'vOCCAC AC ATC CCTCTCTACCC AC AC ACCCC a ACCC AC A ACAAC TAG TATTCCT AAA tc t 
5670 5^00 59X0 5920 59 30 59*0 5950 



- s - y ._ r -- 6 — j ~-r->" 1 M V ♦ T4 ♦ P. H S V L V 
A • C Y * 0 F "ft G 5 * P K A * C K I N P T L C 
H E 0 X ISL^OOS.LKPC-VKLTPLCVSL 
rcCATCACCATArAATCAGTTTATCGGATCAAACCCTAAACCCATGTCTAAAATTAACCCCACTCTCTCTTACTTT 
5010 6020 6030 60*0 6050 6060 6070 




ATACCA4TrCTTBT*6CCCCCA»*TG*TG*T6CA6*AACC*C*G4TA»*»**CTCCTCTTTCAAT»TCACCAC»*C 

6130 6l«0 (.150 6160 6170 6180 M90. 

L | * Y Q ♦ I " I LP* I * • a V Y T P U S L m ' 
. Y 4 T 1 « » ■> Y- Y 0 L Y V 3 K L * H L S H Y T C 
0 I I P I 0 h » H T S Y T L T S C ft* T ?) V t T 0 . 
f TO a TA T.' I f ACC A AT *G A T aZTTaTaCj ACC ACCT AT AC 5 T TCAC^a^JTTCT AACA^CTC^ CTCATTACAC*-iCC 
W 50 6260 '^70 6?*0 »«?«0 6300 ''313 

I • ' ' 

P y . y . V F • M .V I I « S 1 * ». » »• V 0 n % * 




16riQV.84-"329099 

■■-sqpktacttcyckkccpmc 

0*vSU<LLVPLAlv<SVAPj A 
C AC v A AG TCACCCTAAAAC TCCTTC TACC ACTTCCTATTCTAAAAACrCTTCCrTTCATTO 
o 5350 5360 5370 5360 5300 5400 



atkt-ssposos-ssfsikavs 

uRRftPPQCSGTHOVSLSKO*V 

SOE0LLKAVRUX<FLYOSSK» 
-iCCCACCAACACCTCCTCAACCCACfCACACTCATCAACTTTCTCTATCAAACCACTAACT 
0 3470 5480 5490 5500 5510 5520 



SMSCVVH5NHR I * E N I KTKK 
IAIV VMSIVIIEYRKILRO RK 

♦ 0*LCGP»*S«NlGKY*DK€K 
T AGCAATAGTTCTGTGGTCCATAGTAATCATACAATATAGGAAAATATTAACACAAAGAAA 
i 5590 5600 5610 5620 5630 5640 



RRNISTCGOGGGNGAPCSLG 
Ge!SALVe«GVe«GHHAJ>MO 
K £ KYOHLWR W G W K W G T H L L G I 
a AGGACA AAT ATCAGC ACTTGTCGAGATGGGGGTGGAAATGGGGCACCATGCTCCTTGGGA 
; 5710 5720 5730 5740 5750 5760 



CCFKGPPLYFVHOILKHMtO 
VEGSNHHSILCI<C*SI*YP 
VWKEAT'TTLFCASOAKAYOTE 
. TOTGGA AGGAACCA ACC ACC ACTCT ATTTTGTGC ATC AGATGCT AAACCATA TG AT AC AC 
5330 5840 5850 5960 5870 5880 



♦ Y*i**#OKILTCGK»TW*NR 
? I C K C 0 *KF*HVSK*HCRTO 
V V L V N V T| E K*F N fl '4 K H 0 .1 V 6 , 0 - H 
. t A G T ATTGGTAAATGTGAC AGAAAATTTT A ACATGT GG AA AAATC ACA TG GTAGAACAGA 
5950 5960 5970 5980 5990 6000 



h" s * v u ' v * *s" ** a x i — a — c * it*' x~ r-* P i - v — Y^ 

TLC^FKVH^FC C C Y » Y 0 • * 

o LC VSLKCT0L6 (N A Tj N V K S $1 
C AC TCT GTGTTAGTTT AAAGTCCACTG ATTTGGGG A'A I L»l I AL ( AAT ACC AATACT AGTA 
6070 6030 6090 6100 6110 6120 



5ISA0A*EV*CP<NHHFFIN 
0 Y 0 _ rtKHK**CAf*!ClFL«T 
c Is i sl TSIRGKVCKEYAFFYKL 
TC A a"t A TC AGCACA AGCAT A AGAGGT AAGGTCC AG A * AGA ATATCCAT TTTTTT ATAAAC 
M')0 6200 6210 *2?0 6230 6240 



aSL"*P*0* YPLSOFPYIIV 
SHYTGLSKG IL^AMSMTLCC 
sjv iTOACPKVSFEPlPlHYCA 
OGTCATTACACAGCCCTGTCCAAACCTATCCTTTCACCC AATTCCC AT ACATT ATTGTC 
f.3\0 6320 6330 *»V.O 6350 6360 



1 5 HOV. 3 V- 233-39 

p c * f c n s >. " * I * • ') v - j » m r t t* 1 a r - 

P 4 C F A I l K C ■* / f < * T ] F fv r, M c P C T f »i v ^ 

CCCCCGCTCr. TTTTCCOArTCT AAAt TCTA ATA/. TAA<,ACCTTC^Ar&CAAC.\CCACC ATCTACAA^T'-.TC AC 
63 70 6340 639Q 6*0.') 6410 6<.20 f.*O0 



CC'.IAV. •OKKR*«'LDLPlSOT*LKI 
A V E. U Q S5RHaGS**IC Q F H « 0 C * .1 

l l N c si laseevvirsa in p n o n a k r 

TOCTCTTCAATCCCACTCT ACCACAACA4CACCTAGTAATTACATCTCCC AATTTCACACACAATOCTAAAACC 
6*90 6500 6510 6520 6530 65*0 6550 



P T T*lwEKVSVSRGOQCCHLLO*E<« 
0 C Of K K K Y P YPECTdfS ICYMPKM 
N M> "M fl R K S [RlORGPGRAFVTIGKl 
CCAACAACAATACAACAAAAACTATCCCTATCC ACA GGCC ACC AC GC AGA GC ATTTG TT AC A AT AC C A 4 A A ATA 
6610 6620 6630 6640 6650 6660 6670 



. * N — -A — — L~ .A — 11 • N U ..E.-..L. .1. £ S L S fc 

C H FKTOS+OIKRTIMK' ♦ ♦ N ;< H ' L ♦ A 
_A T| LKOIASKLReOFGN In X Tl I I F K 0 

A TGCSacTTTA AA AC AGAT AGCTAGC A AATT AA G AGA ACAATTTCG AA4T AATAAAAC A ATA ATCTTT A AGC AA 
6730 6740 6750 6760 6770 6780 6790 



I GNF5TV I OHNCL I V t G . L I Y L G V L X 
W G I F L t • F N T T V • ♦ Y L V ♦ » Y L FY* 
G E F F Y C /N S Tl 0 L F h S Tl W f JN S Tl w S T E 
CAGGGGAATTTTTCT ACTGT AATTC A4C ACA ACTC t TTTA AT AGT ACTTCG,TT TAATAG TAC TTGG^GT AC TG A At 
6850 6860 6870 6890 6890 6900 6910 



£ •nnl*TCGXK*EKOCHPLPSADKL 
NKTIYKHVAG5RKSNVCPSM0RTH* 
IKOFINMWOEVGKAflYAPPlSCOI 
GA AT AA AAC A ATTT ATAAAC ATGTGGC AGG AAG TAGGAAAAGCA ATGT ATGCCCCTCCC ATCAGCCCACAAATT4 
6970 6980 6990 7000 7010 7C20 7C30 



v iTTHGi>RSSDteeei # *GTicevNY 

* ♦ 0 0 W V ROLOTWRRRYEGOLEK* I 1 
N N N ft* G S> eiFRPGGOOnROMWRSEL 
GTAATAAC AACASTCCCTCwGAGATCT TCAGACCTGGAGG AGGAGATATG AGCGAC A ATTGG AGAAGTG AAT TA' 

7090 7100 7110 7120 7130 7140 7150 



PR0Rr6UCR£KKE0M£»ELCSLGSW 
OGKEKSGAcRKKSSGNRSFVPWVL< 
K AKRRV'VORE <R A V G I GAtFLGFt 
CC AAGCCA AACACAAGAGTGGTGC AG AGAGAAAA AAGAGC AGTCGGAATACCAGCTTTGTTCCTTGGGTTCTTCf 
7210 7220 7230 7240 7250 7260 7270 



YRPQNrCLV*CSSRT-fC+GLLRft"S 
TGOTl I VWY SA A a E 0 F A £ G Y * G a ft 
OARQLLSGI Y00 0 MNLLRAIEA09 
TACAGGCCAGACAATTATTGTCTGGTATAGTCCAGCAGCAGAACAATTTGCTCAGCGCTATTGAGGCCCAACACt 
7330 7340 7350 7360 7370 7380 7390 



E$4f.tlX0T»R!:4$SW'6FCV4L6NS 




..\.\c accacc ATCTACAA/.rr.Tc accac at. r ac AiTor ac at at cr. a a n.ir,ccc tag ta tc a ac rc/. AC 

6420 6430 6440 6450 6460 6470 6480



p rsoT ML<p»*rs*rML*Ku r v u o 

Q F H KOC^NHWSTA C p t C H N » L Y K T 
f»i r M ONAKTI IVOU N 0 Si V E I /n C O R P 
:C A A rTTCACAGACAATGCTAAAACC ATAATAGT ACAGCTGAACCAATCTCT AGAAATTAAT TGTACA AGAC 
"> 69*0 6650 6560 6570 55HO * 6590 6*00 



^HLL-0»EK*Et •OKMtVTLVFONC 
StCYVPKMRKYETSTL * M * ■ ♦ $ K * F 
AFVTIGKIGN*RQAHC If. ! $| R A K W [jf 
-ACCATTTCTT AC A AT AC C A A A A ATACG A AAT AT G AC AC AAGC AC A TT C T A AC AT T AG T AC AGC A A AA T GGA 
6660 6670 6680 6690 6700 6710 6720



I . ,k P ♦ s l s n ? q g 5 l q_..k jl ? J y _ j, i v 

» ♦ W .VV'L ♦ A I L ft R G P R N C N A 0 F ~ ♦ L W 
/v * rA I [FKOSSGCDPflVTHSFNCG 
T AATAAAACAATAATCTTTAAGCAATCCTCAGGAGGGGACCCACAAATTCTAACCCACACTTTTAATTCTG 
6780 6790 6800 6810 6820 6830 6840



G L ivlcvlkgo r TLKFVTOSHSHA 

V»»YL6T*RV K » M ♦ RK*H'1HTPMC 
F Jn S Tl W S T E G $ f4 N Tj EGSOTrrtPCR 
0 tt rAATAGTACTTCGAGT ACTGAAGGCTCA AATAACACTGAACGAAGTCAC ACAATCACACTCCCATCCA 
6900 6910 6920 6930 6940 6950 6960



vptPSAOKLOVHOILOCCY^OEnV 
CP$H.QRTN«»FI * V Y RAAlNKRUrf 
APPISGOIRCSS [n I Tj GLLLTROGG 
T GCCCCTCCCATCAGCCGACAAATTAGATGTTCATCAAATATTACAGCCCTGCTATTAACAACAGATGGTC 
7020 7030 7040 7050 7060 7070 7080



* GTIGEVNY I NI'K**KL*1H*E * H P 
5 G 0 L E K ♦ T l*I*5SK\'«TtRS STH 

ROMWRSELYKYKVVKIEPLCVAPT 
AGGG AC A ATTGGAGAAGTGAATTATATAA A TATA AAGT ACTA A A AATTGA ACC ATT AGGAGT AGC ACCC A 

7140 7150 7160 7170 7180 7190 7200



* eLCSLGS.'U€a3£AL*AHG0«*«R 
^SFVPHVLCSSRKHYCRTVNt)AOG 
C. ALFLGFLGAAGSTrtGARS -lTLTV 
AGGAGCTTTGTTCCTTCGGTTCTTGGGAGCACCAGCAAGCACTATGGGCCCACCCTCAATCACCC TGACGG 
7260 7270 7280 7290 7300 7310 7320



•GLLRRNSICCNSOSGASSSSRO 
AFGY*GATASVATHSLCHOAAPCK 
LRAI-EAOOHLLOLTVUG I K 0 L 0 A R 
-.CTGAGGCCTATTCACGCCCAACAGCATCTCTTGCAACTCACAGTCTCGGCCATCAAGC AGCTCC *CCCAA 
7380 7390 7400 7410 7420 7430 7440



GVALENSF4FLLCUG-. LVCVlNL 




NPCCGKlPKCS > f> G 0 L C l l ^ K T h 

ILAVESrLKOOuLLGI wCCSCKLl 
GAAfCC TGGCTGTGGA A AGATaCCTAA AG CATC A AC A GCTCCTGCGG A TTTGGGGTTGC T C T GG A A A & C T C A TI 
7450 7460 7470 7480 7490 7500 7510

WNRFGIT*PGWSGTEKLTJT0A»YI 
C T 0 L E ♦ HQ LDGVG JRM^OLHKL NT 

E a I y n 1* i ' d w (i F w o a E I u y\ y~ n s l i h 

TGGAAC AGATTTGGAATAACATCACCTGGATCGAGTGGGACACAGAAATTAACAATTACACAAGCTTAATACAT 
7570 7580 7590 7600 7610 7620 7630

NYnM*lNGQVC5lGLT*Q I G C G I ♦ K 

I it;ia*rtGKFVELv« * m k l a v v y k 

LLELOKWASLWNWF |hT I d M ' W L W Y I K 
A ATTATTGGA ATT AGAT A AATGCG C A AGTT TGTCGAATTGGTTTAACAT AACA AATTGGC TGTGGT ATAT AAAA 
7690 7700 7710 7720 7730 7740 7750

LLYFL**IELGRDIHHYRPRPTSQP 
CCTPY$£*S*AGIFTI I V S 0 P P . P N 
A V L S I V R V R 0 G Y S PL S FOT HUP T 

TT GCTGTACTTTCT AT AGTGAATAGAGTTAGGCAGCCATaTTCACC ATTATCGTTTC AGACCC ACCTCCCAACC 
7810 7820 7830 7840 7850 7860 7870

A£TETDPFO**TDP*HLSGTICCAL 

exokoihsis e g l l stylgrsasp 

ROROftSIRLV jN G St LALIWOOtRSL 
AGAGAGACAGAGACAGATCCATTCC ATTAG TGA ACdd A Tic TTAGC ACTTATCTGGGACG ATCTGC.GGAGCCTT 
7930 7940 7950 7960 7970 7980 7990

TRIVELLGfcAG'iJEALKYWWNLLOYtf 
RGUrfNF WOAGGGKPSNlGCI SYSI 
FOCGTSCTOGYGSPOILVESPTVC 
ACG AGG ATTGTGGAACT TCT GGGACGC AGGGGGTGGG AAGCCCTC AAATAT TGGTGGAATCTCC T ACAGT ATTGi 

8050 8060 8070 8080 8090 8100 8110



AlAVAEGTORV IEVVOGACRAIFHI 
» » } « l ft G Q I CL *K*YKELVELF AT 
HSS S*C0R*GY«SSrRSL»5YSPH 
C CC ATAGC 4GTAGCTGAGGGGAC AGAT AGGG TT AT AGAAGT AGT ACA AGG ACCT TGTAGAGCTATTCGCCACAT 
8170 8180 8190 8200 8210 8220 8230

CWQV yiCK«CC v.iA YCKGKNET5»AS 
CGK JSKSSVVG WPTVRERrtRRAEP 
VASG0KVVWU0GLL»C<E*06LS3 
GGGTGGCAAGTGGTCAAAAAGTAGTGTGGTTGGATCCCCT ACTGT AAGGCA A AC A ATCAGACG ACCTC ACCC AC 
8290 8300 8310 8320 8330 8340 8350

SNHXtOYSSYQCClCUARSTftGGGC 
AlTSSMTAATNAAC AUUFAQECEE 
OSOVAI 'J3LPXLLVPG » K H K Q R a — S- 
ACCAATCACAAGTAGCAATACAGCAGCTACCAATGCTGCTTCTCCCTGGCT prAAGCACy AGAGGACCAGG AGG 
8410 8420 8430 8440 8450 8460 8470
TCCA A AACTC ATT TCCACC AC TCCTCTCCCTTCCA^TCCTACTTCCAGTAATAAATCTC 
7510 7520 7530 7540 7550 7560



KL M .T F L N ♦ R I AKPARKE*Tft 
$ S L T H 5 L I E E SOVOOEXNEOE 
CAACCTTAATACATTCC T T A A T TC A AC AATCC C A AA ACC ACC A ACA A A AC AATC AACA AG 
7630 7640 7650 7660 7670 7680



CGX*KY$*. ♦ ♦♦£AM*V*E#F 

V V YK'M I H N 0 S R R L G R F K N $ F 
. W Y * I K I F I fl I V G C L V G L/R/I V F 
fCTCCTATATAAAAATATTC ATA ATG AT AGTAGG AGGCTTGGTAGCTTTA AG AATAGT TT 
7750 7760 7770 7780 7790 7800 



PTS0PRGDPTGPK6«KKKV6 
P P P N. PECTROARRNRRRRWR 
r H L. P: T PRGPDRPEGI6EEGG6 
C CC ACCTC CC AACCCCC ACGGG ACCCC AC AGGCCCC AAGGA ATAGAAG AACAACGTGGAG 
7870 7880 7890 7900 7910 7920



ICCALCLFSYHRLRDLLLIV 
SAEPCASSATTA*ETY$ *L* 
LRSLVP.L0LPPt6RLTL.0CN 
y^TCTGC GG AGCCTTGTCCCTCTT C AGCT ACC ACCGCTTGAGACACTTACTCTTG ATTGTA 
7990 8000 8010 8020 8030 8040



LLQYHSOEL-KNSAVSLLNAT 
SYS'lCVftN**lVLLACS*PO 
PTVLESG TKE*CC»LAOCHS 
TCC T ACAGTATTGGAGTCAGGA ACT AAAGAATAGTCCTCTT A6CTTGCTCA ATGCC ACA 
8110 8120 8130 8140 8150 8160



A I p H I P R R I R 0 GLE R I L L ♦ D 
L p ATYLEE ♦ORAWKGFCYItrt 
ySPHT*KNKTCLCKOFaI Rw 
CT ATTCC CC AC ATACCT AC AAGAATAAGACACGGCTTGG AA AGGATTTTGCT ATAAGAT 
8230 8240 8250 8260 8270 8280



rs#A$S.»*GGSSI5RPGKTW 

r a e p aaogvgaasrolekhg 

gLS001*CW60ML6T«KNH6 
CGAGCTGAGCCAGCAGCAGATGGGGTGCCAGCAGC ATCTCGAGACCTGGAAAAACATCG 
8350 8360 8370 8380 8390 8400



OGGCGGFSSMTSGTFKTN Ot 

p £ E 6 V G F P V T P 0 V P L R P H T T 
o g v„ yFOSHLRVL^OO^LT 



j AGCAGC AGGAGGfGG(TTTTT CC AG TCACACCTCACG TaCC TTTA4GACCAATCACTTA 
8470 8480 8490 8500 8510 8520



3 t P^l T Vb A V* • 0 L P M 



10 20 30 40 50 60




AAGCTTGCCT TGAGTGCTTC AAGTAG TGTG TGCCCGTCTG T T G T G T G AC T CTGGT AACT i 

70 80 90 100 110 120

GAGATCCCTC AGACCCTTTT ACTCAGTGTG GAAAATCTCT AGCAGTGCCG CCCCAACAGG 

130 140 150 160 170 180

GACTTGAAAG CGAAAGGGAA ACCAG AGGAG CTCTCTCGAC GCAGGACTCG GCTTGCTGA A 



190 200 210 220 230 240 

GCGCGCACGG CAAGAGGCGA GGGGACGCGA CTGGTGAGTA CGCC AAAAAf TTTGACT AGC 

250 260 270 280 290 300 

GGAGGCT AGA AGGAGAGACA TGGGTGCGAG AGCGTCAGTA TTAAGCGGGG GAGAATTAGA 

310 320 330 340 350 360

rCGATGGGAA AAAATTCGGT TAAGGCCAGG GGGAAAGAAA AAATATAAAT TAAAACATAT 

370 380 390 400 410 420

AGTATGGGCA AGCACGGACC TAGAACGATT CCCTGTTAAT CCTGGCCTGT TAGAAACATC 

430 440 450 460 470 480

AGAAGGCTGT agacaaatac tgggacagct acaaccatcc cttcagacag GATCAGAAGA 

490 500 510 520 530 540 

ACTTACATCA TTATATAATA CAGTAGCAAC CCTCTATTCT GTGCATCAAA GGATAGAGAT 

550 560 570 580 590 600

AAAAGACACC AAGGAAGCTT TAGACAAGAT AGACGAAGAG CAAAACAAAA GTAAGAAAAA 

610 620 630 640 650 660 

AGCACAGCAA GCAGCAGCTG ACACAGGACA CAGCAGCCAC CTCAGCCAAA ATTACCCTAf 

670 680 690 700 710 720

AGTGCAGAAC ATCCAGGGGC AAATGGTACA TCAGGCCATA TCACCTAGAA CTTtAAATGC 

730 740 750 760 770 780 

ATCGGTAAAA GTAGTACAAG AGAAGGCTTT CAGCCC AGAA GTCATACCCA TGTTTTCAGC 

790 800 810 820 830 840

ATT ATCAGAA GGAGCCACCC CACAAGATTT AAACACCATG CTAAACACAG TGGGGGGACA 

850 860 870 880 890 900

TCAAGCACCC ATGCAAATGT TAAAAGAGAC CATCAATGAG GAAGCTGCAG AATGGGATAG 

910 920 930 940 950 960 

ACTCCATCCA CTGCATGCAG GCCCT ATTGC ACCAGGCCAG ATCAGAGAAC CAAGGGGAAG 

970 980 990 1000 1010 1020

TGACATAGCA GGAACTACTA CTACCCTTCA GGAACAAATA GGATGGATGA CAAATAATCC 

1030 1040 1050 1060 1070 1080

ACCTATCCCA GTAGGAGAAA TTTATAAAAG ATGGATAATC CT6G6ATTAA ATAAAAT AGT 

1090 1100 1110 1120 1130 1140




AAGAATGT AT AGCCCTACCA OCATTCTCCA C AT AACACA A CCACCAAAAC AACCC TT TAG 
1150 1160 1170 1180 1190 1200

AC ACTATOTA CACCCGTTCT ATAAAACTCT aagagccgag caagcttcac agcacgtaaa 



1210 1220 1230 1240 1250 1260

aaattgcatg ACAGAAACCT tgttggtcca AAATGCGAAC CCAGATTGTA AGACTATTTT

1270 1280 1290 1300 1310 1320

aaaagcattg GGACCAGCAG ctacactaoa AGAAATGATG ACAGCATGTC AGGGAGTGGG



1330 1340 1350 1360 1370 1380

AGGACCCGGC CATAAGGCAA GAGTTTTGCC TGAAGCAATG AGCC AAGTAA CAAATTCAGC 

1390 1400 1410 1420 1430 1440

TACCATAATG ATGCAAAGAG GCAATTTTAG GAACC AAACA AAGATTCTTA AGTGTTTCAA 

1450 1460 1470 1480 1490 1500

TTGTGGCAAA GAAGGGCACA TAGCCAGAAA TTGCAGGGCC CCTAGGAAAA AGGGCTGTTG 



1510 1520 1530 

GAAATGTGGA AAGGAAGGAC ACCAAATGAA 

1570 1580 1590 

AGGGAAGATC TGGCCTTCCT ACAAGGCAAG 

1630 1640 1650

CCCAACAGCC CC ACC AGAAG AGAGCTTCAG 

1690 1700 1710 

GAAGCAGGAG CCG AT AGACA AGGAACTGTA 

1750 1760 1770 

CAACGACCCC TCCTCACAAT AAAGATAGGG 

1810 1820 1830

GGACCAGATG ATACAGTATT AGAAGAAATC 

1870 1880 1890

ATAGGGGGAA TTGGAGGTTT TATCAAAGTA 

1930 1940 1950

TGTGGACATA AAGCT AT AGO TACAGTATTA 

1990 2000 2010 

AGAAATCTGT TGACTCAGAT TGGTTGCACT 

2050 2060 2070 

GTACCAGTAA AATTAAACCC AGGAATGGAT 

2110 2120 2130 

GAAGAAAAAA TAAAAGCATT AGTACAAATT 

2170 2180 2190

TCAAAAATTG GGCCTGAAAA TCCATACAAT 

2230 2240 2250

AGTACT AAAT GGAGAAAATT AGTAGATTTC 

2290 2300 2310 

TGGCAAGTTC AATTAGGAAT ACCACM$CC 



1540


1550 


1560 






PT AATTTTTT 


1600 


1610


1620 


vCwAvvvAA 1 


TTT^TT^ACA 
1 1 IWI 1 l*AVA 


tlT A&At? ACA 
WVAWAVW AVA 


1660 


1670 


1680


vlbl vvvw I m 


ViAvAvAAmA m. 


CTCCCTCTC A 


1720 


1730 


1740


T CUT T lAACl 


1 Ubv 1 bAvAl 


C ACTrTTTCfi 

WAW 1 V 1 M WW 


1780




1 AAA 


VWVbA Aw 1 ** 


A^r.AicrTrr 

AUVAAVb IWI 


ATT AC AT AC A 


1840 1850 1860

AGTTTGCCAG GAAGATGGAA ACCAAAAATG

1900 1910 1920

AGACAGTATG ATCAGATACT CATAGAAATC

1960 1970 1980

GTAGGACCTA CACCTQTCAA CATAATTGGA

2020 2030 2040

TTAAATTTTC CCATTAGTCC TATTGAAACT

2080 2090 2100

GGCCCAAAAG TTAAACAATG CCCATTGACA

2140 2150 2160

TGTACAGAAA TCCAAAAGGA AGGGAAAATT

2200 2210 2220

ACTCCAGTAT TTGCCATAAA GAAAAAACAC

2260 2270 2280

AGAGAACTTA ATAACAGAAC TCAAGACTTC

2320 2330 2340

GCAGGGTTAA AAAACAAAAA ATCAGTAACA
2380 2390 2400

LTTCCCITAG atgaagactt CAGGAAG1AI

2410 2420 2430 2440 2450 2460

ACTGCATTTA CCATACCTAG TaTAAACAAT GAGACACCAG GGATTAGATA TCAGTACAAT

2470 2480 2490 2500 2510 2520

CTCCTTCCAC AGGGATGGAA AGGATCACCA GCAAJATTCC AAAGTAGCAT GACAAAAATC

2530 2540 2550 2560 2570 2580

TTACACCCTT TTAGAAAACA AAATCCAGAC ATAGTTATCT ATCAATACAT GGATGATTTG

2590 2600 2610 2620 2630 2640

TATGTAGGAT CTGACTTAGA AATAGG6CAC CATAGAACAA AAATAGAGGA GCTGAGACAA

2650 2660 2670 2680 2690 2700

catctgttga GGTGGGGACT taccacacca GACAAAAAAC ATCAGAAAGA ACCTCCATTC

2710 2720 2730 2740 2750 2760

CTTTGCATGG GTTATGAACT CCATCCTGAT AAATGGACAG TACAGCCTAT AGTGCTGCCA

2770 2780 2790 2800 2810 2820

OAaaaagaca gctggactgt caatgacata CAGAAGTTAG TGGCAAAATT GAATTGGGCA

2830 2840 2850 2860 2870 2880

AGTCACATTT ACCCAGCCAT TAAAGTAACG CAATTATGTA AACTCCTfAC AGGAACCAAA

2890 2900 2910 2920 2930 2940

GCACTAACAG AAGTAATACC ACTAACAGAA GAAGCAGAGC TAGAACTGGC AGAAAACAGA

2950 2960 2970 2980 2990 3000

GAGAT7CTAA AAGAACCAGT ACATGGAGTG TATTATGACC CATCAAAACA CTTAATAGCA

3010 3020 3030 3040 3050 3060

GAAATACAGA AGCAGGGGCA acgccaatgg ACATATCAAA TTTATCAAGA GCCATTTAAA

3070 3080 3090 3100 3110 3120

aatctgaaaa caggaaaata tgcaagaacg AGGGGTGCCC ACACTAATGA TGTAAAACAA

3130 3140 3150 3160 3170 3180

TTAACAGAGG CAGTOCAAAA AATAACCACA GAAAGCATAG TAATATGGGC AAA6ACTCCT

3190 3200 3210 3220 3230 3240

AAATTTAAAC TACCCATACA AAACGAAACA TGGGAAACAT GGTGOACAGA GTATTGGCAA

3250 3260 3270 3280 3290 3300

CCCACCTCGA TTCCTCACTC GGACTTTGTC AATACCCCTC CTTTACTGAA ATTATGGTAC

3310 3320 3330 3340 3350 3360

CAGTTAGAGA AAGAACCCAT AGTAGGAGCA GAAACGTTCT ATGTAOATGG GGCAGCTAGC

3370 3380 3390 3400 3410 3420

AGGGAGACTA AATTAGGAAA ACCAGGATAT GTTACTAATA GAGGAAGACA AAAAGTTGTC

3430 3440 3450 3460 3470 3480

ACCCTAACTG ACACAACAAA TCACAACACT GAGTTACAAG CAATTCATCT AGCTTTGCAG

3490 3500 3510 3520 3530 3540

GATTCGGGAT fAGAAGTAAA TATAGTAACA GACTCACAAT ATCCATTAC6 AATCATTCAA

3550 3560 3570 3580 3590 3600

GCACAACCAG ATAAAA6TGA ATCAGAGTTA gtcaatcaaa TAATAGAGCA CTTAATAAAA

3610 3620 3630 3640 3650 3660



3670 3680 3690 3700 3710 3720

CTACA7AAAT TAGTCAGTGC TGCAATCAGC AAAGTACTAT TTTTACATGG AATAGAr

3730 3740 3750 3760 3770 3780

CCCCAACATG AACATGAGAA ATATCACAGT AATTGGAGAC CAATCCCTAG TGAtTTTAAC

3790 3800 3810 3820 3830 3840

CTOCCACCTO TACTACCAAA ACAAATACTA GCCAGCTGTC AfAAAfGTCA GCTAAAAGGA

3850 3860 3870 3880 3890 3900

CAACCCATCC ATGGACAACT AGACTGTAGT CCAGGAATAT GGCAACTAGA TTGTACACAT

3910 3920 3930 3940 3950 3960

TTACAACCAA AAGTTATCCT CGTAGCAGTT CATGTAGCCA GTGGATATAT AGAAGCACAA

3970 3980 3990 4000 4010 4020

GTTATTCCAG CAGAAACAGG GCAGGAAACA GCATACTTTC TTTTAAAATT AGCAGGAAGA

4030 4040 4050 4060 4070 4080

TGGCCAGTAA AAACAATACA TACAGACAAT GGCAGCAATT TCACCAGTAC TACCGTTAAG

4090 4100 4110 4120 4130 4140

GCCGCCTGTT GGTGGGCGGG AATCAAGCAG GAATTTGGAA TTCCCTACAA TCCCCAAAGT

4150 4160 4170 4180 4190 4200

CAAGGAGTAG TAGAATCTAT GAATAAAGAA TTAAAGAAAA TTATAGGCCA GGTAAGAGAT

4210 4220 4230 4240 4250 4260

CAGGCTGAAC ATCTTAAGAC AGCAGTACAA ATGGCAGTAT TCATCCACAA TTTTAAAAGA

4270 4280 4290 4300 4310 4320

AAAGGGGGGA TTGGGGGGTA CAGTGCAGGG GAAAGAATAG TAGACATAAT AGCAACAGAC

4330 4340 4350 4360 4370 4380

ATACAAACTA AAGAATTACA AAAACAAATT ACAAAAATTC AAAATTTTCG GGTTTATTAC

4390 4400 4410 4420 4430 4440

AGGGACAGCA GAGATCCACT TTGGAAACGA CCAGCAAAGC TCCTCTGGAA AGGTGAAGGG

4450 4460 4470 4480 4490 4500

GCAGTAGTAA TACAAGATAA TAGTCACATA AAAGTAGTGC CAAGAAGAAA ACCAAAGATC

4510 4520 4530 4540 4550 4560

ATTAGGGATT ATCCAAAACA GATGGCAGGT GATGATTGTC TGGCAAGTAG ACAGGATGAG

4570 4580 4590 4600 4610 4620

GATTAGAACA TGGAAAAGTT TAGTAAAACA CCATATGTAT GTTTCAGGGA AAGCTAGGGG

4630 4640 4650 4660 4670 4680

ATGGTTTTAT AGACATCACT ATGAAAGCCC TCATCCAACA ATAAGTTCAG AAGTACACAT

4690 4700 4710 4720 4730 4740

CCCACTAGGC GATCCTAGAT TGGTAATAAC AACATATTGG GGTCTGCATA CAGGACAAAC

4750 4760 4770 4780 4790 4800

AGACTGGCAT CTCGGTCAGG GAGTCTCCAT AGAATGGAGG AAAAAGAGAT ATAGCACACA

4810 4820 4830 4840 4850 4860

AGTACACCCT CAACTAGCAG ACCAACTAAT TCATCTCTAT TACTTTCACT CTTTTTCAGA

4870 4880 4890 4900 4910 4920




4930 4940 4950 4960 4970 4980

aggacataac aagctaggat ct:tacaata CTTCCCACTA GCACCATTAA TAACACCAAA 

4990 5000 5010 5020 5030 5040

A AAGAT4AAG CCACCTTTGC CT ACTC TTAC C AAACTCACA CACGAf ACAT GGAACAAGCC 

5050 5060 5070 5080 5090 5100

CC AGAAGACC AAGGGCC AC A CACGG ACCC A CACAATGAAT GGACACTAGA GCTTTTAGAG 

5110 5120 5130 5140 5150 5160

GAGCTTAAGA ATGAAGCTGT TAGACATTTT CCT ACGATTT GGCTCCATGG CTTAGGGCA A 

5170 5180 5190 5200 5210 5220 

CATA TCTATG AAACTTA TGG GGATACTTGG GCAOGACrGC AAGCCATAAT AAGAATTCTG 

5230 5240 5250 5260 5270 5280 

CAACAACTGC TGTTTATCCA TTTCAG AATT GGGTGTCGAC ATAGCAGAAT AGGCGTTACT 

5290 5300 5310 5320 5330 5340 

CAACACACGA CAGCAAGAAA TCGACCCACT AGATCCTAGA CTAGAGCCCT GCAACCATCC 

5350 5360 5370 5380 5390 5400

AGGAAGTCAG CCT AAAACTC CTTCTACCAC TTGCTATTCT AAAAAGTGTT GCTTTCATTG 

5410 5420 5430 5440 5450 5460 

CCAAGTTTGT TTC ACAACAA AAGCCTTAGG CATCTCCTAT GGCAGGAAGA AGCGCAGACA 

5470 5480 5490 5500 5510 5520 

GCGACGAAGA CCTCCTCAAG CCAGTCACAC TCATCAAGTT TCTCTATCAA AGCAGTAAGT 

5530 5540 5550 5560 5570 5580

AGTACATGTA ATGCAACCT A TACAAATACC AATAGCAGCA TTAGTAGTAG CAATAATAAT 

5590 5600 5610 5620 5630 5640 

AGCAATAGTT GTGTGGTCCA TAGTAATCAT AGAATATAGG AAAATATTAA GACAAAGAAA 

5650 5660 5670 5680 5690 5700

AATAGACAGG TTAATTGATA CACTAATAGA AAGAGCAGAA GACAGTGGCA ATGAGAGTCA 

5710 5720 5730 5740 5750 5760 

AGGAGAAATA TCAGCACTTG TGGAGATGGG GGTGGAAATG GGGCACCATG CTCCTTGGGA 

5770 5780 5790 5800 5810 5820

TATTGATCAT CTGTAGTGCT ACAGAAAAAT TGTGCGTCAC AGTCTATTAT CGGGTACCTC 

5830 5840 5850 5860 5870 5880

TGTGGAAGGA AGCAACCACC ACTCTATTTT GTGCATC AGA TGCT AAAGCA TATGA fACAG 

5890 5900 5910 5920 5930 5940 

ACGTACATAA TGTTTGGGCC ACACATGCCT GTGTACCCAC AGACCCCAAC CCACAAGAAC 

5950 5960 5970 5980 5990 6000

TAGTATTGGT AAATG*TGAC A GAAAATTTTA ACATGTGGAA AAATGACATO CTAGAACACA 

6010 6020 6030 6040 6050 6060

TGCATGACGA TATAATCAGT TTATGGGATC AAAGCCTAAA GCCATGTGTA AAATTAACCC 

6070 6080 6090 6100 6110 6120 

CACTCTGTGT TAGTTTAAAG TGCACTCATT TCCC6AATCC TACTAATACC AATACTA6TA 

6130 6140 6150 6160 6170 6180

ATACCAAMG TAGTAOCCCC GAAATCATGA TGGAGAAAGG AGAGATAAAA AACTrCTCrr

6190 6200 6210 6220 6230 6240

fCAATATCAG CACAAGCATA AGAGGTAACG TGCACAAACA AT ATGCATTT TTTTATAAAC 

6250 6260 6270 6280 6290 6300

ttcatataat accaatagat aatuatacta ccagctatac gttgacaact tctaacacct 

6310 6320 6330 6340 6350 6360

C AGTCATT AC ACAGCCCTCT CC AAAGGTAT CCTTTGACCC AATTCCCATA CATTATTCTG 

6370 6380 6390 6400 6410 6420

CCCCCCCTCG TTTTCCCATT CT AAA ATGTA ATAAT A AG AC GTTCAATCCA ACAGCACCAT 

6430 6440 6450 6460 6470 6480

GTACAAATGT CAGCACAGTA CAATCTACAC ATGGAATTAG GCCAGTAGTA TCAACTC A AC 

6490 6500 6510 6520 6530 6540 

TGCTGTTGAA TGGCAGTCTA GCACAACAAC AGGTAGTAAT TAGATCTGCC AATTTCACAG 



6550 6560 6570 6580 6590 6600

ACAATGCTAA AACCATAATA GTACAGCTGA ACCAATCTGT AGAAATTAAT TGTACAACAC

6610 6620 6630 6640 6650 6660

CCAACAACAA TACAAGAAAA AGTATCCGTA TCCAGAGGGG ACCAGGGAGA GCATTTGTTA

6670 6680 6690 6700 6710 6720

CAATAGGAAA AATAGGAAAT ATGAGACAAG CACATTGTAA CATTA6TAGA GCAAAATGGA

6730 6740 6750 6760 6770 6780

ATGCCACTTT AAAACAGATA GCTAGCAAAT TAAGAGAACA ATTTGGAAAT AATAAAACAA

6790 6800 6810 6820 6830 6840

TAATCTTTAA GCAATCCTCA GGAGGGCACC CAGAAATTGT AACGCACAGT TTTAATTGTG

6850 6860 6870 6880 6890 6900

GAGGGGAATT TTTCTACTGT AATTCAACAC AACTGTTTAA TAGTACTTCG TTTAATACTA

6910 6920 6930 6940 6950 6960

CTTGGAGTAC TGAAGGGTCA AATAAwACTC AAwGAAvTUA CACAATCACA CTCCCATGCA

6970 6980 6990 7000 7010 7020

GAATAAAACA ATTTATAAAC ATGTGGCAGG AAGTAGGAAA AGCAATGTAT GCCCCTCCCA

7030 7040 7050 7060 7070 7080

TCAGCGGACA AATTAGATGT TCATCAAATA TTACAGGGCT GCTATTAACA AGAGATGGTG

7090 7100 7110 7120 7130 7140

CTAATAACAA CAATGGGTCC GAGATCTTCA GACCTGGAGG AGGAGATATC AGGGACAATT

7150 7160 7170 7180 7190 7200

GGAGAAGTGA ATTATATAAA TATAAAGTAG TAAAAATTGA ACCATTAGGA CTAGCACCCA

7210 7220 7230 7240 7250 7260

CCAAGGCAAA GAGAAGAGTG GTGCAGAGAG AAAAAAGAGC AGTGGGAATA GGAGCTTTGT

7270 7280 7290 7300 7310 7320

TCCTTGGGTT CTTGGGAGCA GCAGGAAGCA CTATGGGCCC ACGGTCAATG ACGCTGACGC

7330 7340 7350 7360 7370 7380

TACAGGCCAG ACAATTATTG TCTCGTATAG TGCAGCAGCA GAACAATTTG CTGAGGGCTA

7390 7400 7410 7420 7430 7440

.4«CGCA tCAuCirOC FTCC.-.ACTCA CAOrCTCCCC CATC4A0CAC CTCCACCCAA

7450 7460 7470 7480 7490 7500

CaATCCTCCC TG T GG A A AG A TACUAAAGG ATCAACACCT CCTCCCCATT TGGGGTTGCT 

7510 7520 7530 7540 7550 7560

CTGGAAAACT CATTTGCACC ACTGCTGTGC CTTGGAATGC TAGTTGGAGT AATAAATCTC 

7570 7580 7590 7600 7610 7620

rGG AACAGAT TTGGAAT AAC ATGACCTGGA TCGACTGGGA CAGAGAAATT AACAATTAC A 

7630 7640 7650 7660 7670 7680

CA agcttaat acattcctta attgaacaat cgcaaaacca gcaagaaaag aatgaacaag 

7690 7700 7710 7720 7730 7740

AATTATTGGA attagataaa tgggcaagtt tgtgcaattg ctttaacata acaaattcgc 

7750 7760 7770 7780 7790 7800

TGTGGTATAT AAAAATATTC ataatgatag tagcaggctt ggtaggttta agaatagttt 

7810 7820 7830 7840 7850 7860

TTGCTGTACT TTCf ATAGTG AATACACTTA GCCACCGATA TTCACCATTA TCGTTTCAGA 

7870 7880 7890 7900 7910 7920

CCCACCTCCC AACCCCGAGG GGACCCGACA GGCCCGAAGG AATAGAAGAA GAAGGTGCAC 

7930 7940 7950 7960 7970 7980

AGAGAGACAG AGACAGATCC ATTCGATTAG TGAACGGATC CTTAGCACTT ATCTGGGACG 

7990 8000 8010 8020 8030 8040

ATCTGCCGAG CCTTCTGCCT CTTCAGCTAC CACCGCTTCA GAGACTTACT CTTGATTGTA 

8050 8060 8070 8080 8090 8100

ACGAGGATTG TGGAACTTCT GGGACGCAGG GGGTGGGAAG CCCTCAAATA TTGGTGCAAT 

8110 8120 8130 8140 8150 8160

CTCCTACAGT ATTGGAGTCA GGAACTAAAG AATAGTGCTG TTACCTTGCT CAATCCCACA 

8170 8180 8190 8200 8210 8220

GCC ATAGCAG TAGCTGACGG CACACATACG GTTATAGAAG TAGTACAAGG AGCTTGTAGA 

8230 8240 8250 8260 8270 8280

GCTATTCGCC ACATACCTAG AA6AATAAGA CAGGCCTTG6 AAAGGATTTT GCTATAA6AT 

8290 8300 8310 8320 8330 8340

GGGTGGGAAG TGGTCAAAAA GT AGTGTGGT TGGATGGCCT ACTGTAAGGG AAAGA ATGAG 

8350 8360 8370 8380 8390 8400

ACG AGCTGAG CCAGCAGC AG ATGGGGTGGG AGCAGCATCT CCACACCTCG AAAAACATGG 

8410 8420 8430 8440 8450 8460

AGCAATCACA ACT AGCAATA CAGCAGCTAC CAATGCTGCT TGTGCCTGGC TAGAAGCACA 

8470 8480 8490 8500 8510 8520

AGAGGACCAC GAGGT6GCTT TTCCACTCAC ACCTCAGGTA CCTTTAAGAC CAATGACTTA 

8530 8540 8550 8560 8570 8580

CAACGCAGCT GTAGATCTTA GCC ACTTTTT AAAAGAAAAG CGGCGACT6G AAGGGCTAAT 

8590 8600 8610 8620 8630 8640

TCACTCCCAA CGAAGAC AAC ATATCCTTCA TCTGT6GATC TACCACACAC UCCCTACTT 

8650 8660 8670 8680 8690 8700

CCClwAT CACAACtACA CACCAGGGCC AGGGGTCaCA TATCCACTCA CCTTTCGAT 



8710 8720 8730 8740 8750 8760

GTGCTACAAG CTAGTACCAG HGAGCCAGA TAAGGTAGAA GAGGCCAATA AACGAGAGAA



8770 8780 8790 8800 8810 8820

CACC AGCTTG rrACACCCTG TGAGCCTGCA TGGAATGGAT GACCCTGAGA CAGAACTCTT 

8830 8840 8850 8860 8870 8880

AG AGTGGAGG tttcacagcc gcctagcatt tcatcacgtg gcccgagacc TGCATCCGGA 

8890 8900 8910 8920 8930 8940

ctacttcaag aactcctgac atccaccttg ctacaaggga ctttccgctc gggactttcc 

8950 8960 8970 8980 8990 9000

AGGGAGGCGT ggcctgggcg gaactgggga gtggcgagcc ctcagatgct gcatataacc 

9010 9020 9030 9040 9050 9060 

AGCTCCTTTT TGCCTCTACT GCGTCTCTCT GGTTAGACCA GATTTGAGCC TGGGAGCTCT 



9070 9080 9090 9100 

CTGGCT AACT AGGGAACCCA CTGCTtAAGC CTCAATAAAG CTT 
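
Each full row of the listing above carries sixty nucleotides printed as six blocks of ten, and the number printed over a block is the position of that block's last nucleotide. The following is a minimal sketch, not part of the original disclosure, of how a transcribed row might be checked against that layout. The names `clean_row` and `check_row` are illustrative assumptions, and any character other than A, C, G or T is only reported as probable transcription damage, never corrected.

```python
import re

VALID_BASES = set("ACGT")


def clean_row(row: str) -> str:
    """Strip all whitespace from one printed sequence row and upper-case it."""
    return re.sub(r"\s+", "", row).upper()


def check_row(row: str, end_label: int, width: int = 60):
    """Check one row of the listing against its position ruler.

    end_label is the last number printed over the row (assumed to mark the
    position of the row's final nucleotide); width is the expected number
    of nucleotides per row. Returns (sequence, start_position, problems).
    """
    seq = clean_row(row)
    problems = []
    if len(seq) != width:
        problems.append(f"expected {width} bases, found {len(seq)}")
    bad = sorted({c for c in seq if c not in VALID_BASES})
    if bad:
        problems.append("non-ACGT characters: " + "".join(bad))
    start = end_label - width + 1
    return seq, start, problems


if __name__ == "__main__":
    # Row printed under the ruler ending at 120 in the listing above.
    row = ("GAGATCCCTC AGACCCTTTT ACTCAGTGTG "
           "GAAAATCTCT AGCAGTGCCG CCCCAACAGG")
    seq, start, problems = check_row(row, 120)
    print(start, len(seq), problems)
```

A row that passes both checks can be joined to its neighbours by position. A row that fails should be compared against the original printed page before any further use.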



